Context-dependent feature analysis with random forests

Authors

  • Antonio Sutera
  • Gilles Louppe
  • Vân Anh Huynh-Thu
  • Louis Wehenkel
  • Pierre Geurts
Abstract

Feature selection is often more complicated than identifying a single subset of input variables that together explain the output. There may be interactions that depend on contextual information, i.e., variables that turn out to be relevant only in some specific circumstances. In this setting, the contribution of this paper is to extend the random forest variable importances framework in order (i) to identify variables whose relevance is context-dependent and (ii) to characterize as precisely as possible the effect of contextual information on these variables. The usage and relevance of our framework for highlighting context-dependent variables are illustrated on both artificial and real datasets.
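To make the notion of context-dependent relevance concrete, the following is a minimal sketch on a toy dataset. It is not the paper's random forest framework: it uses empirical mutual information as a simple stand-in relevance measure, and the names `make_toy_data`, `mutual_information`, and the binary context variable `c` are assumptions introduced only for this illustration. The input `x` determines the output `y` when the context is `c == 1` and is unrelated to `y` when `c == 0`.

```python
import math
import random
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information (in bits) between the two items of each pair."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def make_toy_data(n=4000, seed=0):
    """Hypothetical data: y copies x in context c == 1, y is pure noise in c == 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = rng.randint(0, 1)
        x = rng.randint(0, 1)
        y = x if c == 1 else rng.randint(0, 1)
        rows.append((c, x, y))
    return rows


rows = make_toy_data()

# Relevance of x for y when the context is ignored.
mi_all = mutual_information([(x, y) for _, x, y in rows])

# Relevance of x for y measured separately inside each context.
mi_by_context = {
    ctx: mutual_information([(x, y) for c, x, y in rows if c == ctx])
    for ctx in (0, 1)
}

print(f"I(x; y) ignoring context: {mi_all:.3f} bits")
for ctx, value in mi_by_context.items():
    print(f"I(x; y | c = {ctx}): {value:.3f} bits")
```

In this toy setting the relevance of `x` is close to zero in one context and close to one bit in the other, while the context-free score falls in between and hides that difference. The abstract's goals (i) and (ii) correspond to detecting such a gap and describing how the context changes it.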


Similar resources

Banzhaf Random Forests

Random forests are a type of ensemble method which makes predictions by combining the results of several independent trees. However, the theory of random forests has long been outpaced by their application. In this paper, we propose a novel random forests algorithm based on cooperative game theory. Banzhaf power index is employed to evaluate the power of each feature by traversing possible feat...


Image Categorization Using Scene-Context Scale Based on Random Forests

Scene-context plays an important role in scene analysis and object recognition. Among various sources of scene-context, we focus on scene-context scale, which means the effective scale of local context to classify an image pixel in a scene. This paper presents random forests based image categorization using the scene-context scale. The proposed method uses random forests, which are ensembles of...


CARAF: Complex Aggregates within Random Forests

This paper presents an approach integrating complex aggregate features into a relational random forest learner to address relational data mining tasks. CARAF, for Complex Aggregates within RAndom Forests, has two goals. Firstly, it aims at avoiding exhaustive exploration of the large feature space induced by the use of complex aggregates. Its second purpose is to reduce the overfitting introduc...


Random Forests of Binary Hierarchical Classifiers for Analysis of Hyperspectral Data

Statistical classification of hyperspectral data is challenging because the input space is high in dimension and correlated, but labeled information to characterize the class distributions is typically sparse. The resulting classifiers are often unstable and have poor generalization. A new approach that is based on the concept of random forests of classifiers and implemented within a multiclass...


Context-Dependent Data Envelopment Analysis-Measuring Attractiveness and Progress with Interval Data

Data envelopment analysis (DEA) is a method for recognizing the efficient frontier of decision making units (DMUs). This paper presents a context-dependent DEA that uses interval inputs and outputs. A context-dependent approach with interval inputs and outputs can consider a set of DMUs against a special context. Each context shows an efficient frontier including DMUs in particular l...



Journal:
  • CoRR

Volume: abs/1605.03848

Publication date: 2016